Abstract: The capacity of 1-D constraints is given by the
entropy of a corresponding stationary maxentropic Markov 
chain. Namely, the entropy is maximized over a set of probability 
distributions, which is defined by some linear requirements. In 
this paper, certain aspects of this characterization are extended to 
2-D constraints. The result is a method for calculating an upper 
bound on the capacity of 2-D constraints. 

The key steps are: The maxentropic stationary probability 
distribution on square configurations is considered. A set of 
linear equalities and inequalities is derived from this stationarity. 
The result is a concave program, which can be easily solved 
numerically. Our method improves upon previous upper bounds 
for the capacity of the 2-D "no isolated bits" constraint, as
well as certain 2-D RLL constraints. 

I. Introduction 

Let $\Sigma$ be a finite alphabet. A one-dimensional (1-D) constraint is a set $S$ of words over $\Sigma$. For the set $S$ to be called a 1-D constraint, there must exist an edge-labeled graph $G$ with the following property: a word $w = w_1 w_2 \ldots w_n$ is in $S$ iff there exists a path in $G$ for which the successive edge labels are $w_1, w_2, \ldots, w_n$ (see [1]).

A two-dimensional (2-D) constraint over $\Sigma$ is a generalization of a 1-D constraint; it is a set $\mathbb{S}$ of rectangular configurations over $\Sigma$ and is defined through a pair of vertex-labeled graphs $(G_{\mathrm{row}}, G_{\mathrm{col}})$, where $G_{\mathrm{row}} = (V, E_{\mathrm{row}}, L)$ and $G_{\mathrm{col}} = (V, E_{\mathrm{col}}, L)$. Namely, both graphs share the same vertex set and the same vertex labeling function $L : V \to \Sigma$. The constraint $\mathbb{S} = \mathbb{S}(G_{\mathrm{row}}, G_{\mathrm{col}})$ consists of all finite rectangular configurations $(w_{i,j})$ over $\Sigma$ with the following property: Let $\Lambda$ be the rectangular index set of $(w_{i,j})_{(i,j) \in \Lambda}$. There exists a configuration $(u_{i,j})_{(i,j) \in \Lambda}$ over the vertex set $V$ such that (a) for each $(i,j) \in \Lambda$ we have $w_{i,j} = L(u_{i,j})$; (b) each row in $(u_{i,j})$ is a path in $G_{\mathrm{row}}$; (c) each column in $(u_{i,j})$ is a path in $G_{\mathrm{col}}$. Examples of 2-D constraints include the square constraint [2], 2-D runlength-limited (RLL) constraints [3], 2-D symmetric runlength-limited (SRLL) constraints [4], and the "no isolated bits" constraint [5].

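Let $\mathbb{S}$ be a given 2-D constraint over a finite alphabet $\Sigma$. Denote by $\Sigma^{M \times N}$ the set of $M \times N$ configurations over $\Sigma$, and let

$$\mathbb{S}_{M,N} = \mathbb{S} \cap \Sigma^{M \times N} , \qquad \mathbb{S}_M = \mathbb{S} \cap \Sigma^{M \times M} .$$

The capacity of $\mathbb{S}$ is equal to

$$\mathrm{cap}(\mathbb{S}) = \lim_{M \to \infty} \frac{1}{M^2} \log_2 |\mathbb{S}_M| . \tag{1}$$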

In this paper, we show a method for calculating an upper bound on cap($\mathbb{S}$). Two other methods of calculating an upper bound on the capacity of a 2-D constraint are the following. The first method is the so-called "stripe method," in which we fix a positive integer $N$, and bound cap($\mathbb{S}$) by

$$\mathrm{cap}(\mathbb{S}) \le \lim_{M \to \infty} \frac{1}{M \cdot N} \log_2 |\mathbb{S}_{M,N}| . \tag{2}$$

Namely, we consider only stripes of width $N$, and essentially get a 1-D constraint (since we may regard each of the possible row values as a letter in an auxiliary alphabet). The RHS of (2) is easily calculated for modest values of $N$: Let $G$ be the edge-labeled graph corresponding to the 1-D constraint, and let $A_G$ be the adjacency matrix of $G$. Denote by $\lambda(A_G)$ the Perron eigenvalue of $A_G$. By [1, §3.2], the RHS of (2) is equal to $\frac{1}{N} \log_2 \lambda(A_G)$. The second method for upper-bounding cap($\mathbb{S}$) is the generalization presented by Forchhammer and Justesen [6] to the method of Calkin and Wilf [7].
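The stripe method can be illustrated with a short sketch. The code below is not taken from the paper: it assumes the 2-D hard-square constraint (no two adjacent 1s in a row or a column) as a worked example, and the function name `stripe_bound` and the use of power iteration to estimate the Perron eigenvalue are illustrative choices.

```python
import math
from itertools import product


def stripe_bound(width, iterations=500):
    """Stripe bound (1/N) * log2(Perron eigenvalue) for the hard-square constraint.

    Assumption: the constraint forbids two adjacent 1s horizontally or vertically.
    """
    # Letters of the auxiliary alphabet: rows of the stripe with no two adjacent 1s.
    rows = [r for r in product((0, 1), repeat=width)
            if all(not (a and b) for a, b in zip(r, r[1:]))]
    # Two rows may be stacked iff no column holds a 1 in both of them.
    compatible = [[all(not (a and b) for a, b in zip(r1, r2)) for r2 in rows]
                  for r1 in rows]
    # Power iteration on the 0/1 transfer matrix (it is primitive here,
    # because the all-zero row is compatible with every row).
    v = [1.0 / len(rows)] * len(rows)
    lam = 1.0
    for _ in range(iterations):
        nxt = [sum(v[k] for k in range(len(rows)) if compatible[i][k])
               for i in range(len(rows))]
        lam = sum(nxt)              # v sums to 1, so this estimates the eigenvalue
        v = [x / lam for x in nxt]
    return math.log2(lam) / width


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, round(stripe_bound(n), 6))
```

For width 1 the transfer matrix is [[1, 1], [1, 0]], so the bound is the base-2 logarithm of the golden ratio; wider stripes give smaller bounds at the cost of an alphabet that grows exponentially in the width.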

The capacity of a given 1-D constraint is known to be equal to the value of an optimization program, where the optimization is on the entropy of a certain stationary Markov chain, and is carried out over the conditional probabilities of that chain (see [1, §3.2.3]). We try to extend certain aspects of this characterization of capacity to 2-D constraints. What results is a (generally non-tight) upper bound on cap($\mathbb{S}$).

The structure of this paper is as follows. In Section II we set up some notation. Then, in Section III we show the existence of a certain stationary random variable taking values on $\mathbb{S}_M$ and having normalized entropy approaching the capacity of $\mathbb{S}$, as $M \to \infty$. We then consider a relatively small sub-configuration of that random variable, and denote it by $X^{(M)}$. The section concludes with an upper bound on the capacity of $\mathbb{S}$, which is a function of the probability distribution of $X^{(M)}$. In Section IV we derive a set of linear equations which hold on the probability distribution of $X^{(M)}$. In Section V we argue as follows: The bound derived in Section III is a function of the probability distribution of $X^{(M)}$, which we do not know how to calculate; however, by Section IV we know that this probability distribution is subject to a set of linear requirements. Thus, we formulate an optimization problem, where the unknown probability distribution is replaced by a set of variables, subject to the above-mentioned linear requirements. The maximum of this optimization problem is an upper bound on the capacity of $\mathbb{S}$. We then show that this optimization problem is easily solved, since it is an instance of convex programming. In Section VI we show our computational results. Finally, in Section VII we present an asymptotic analysis of our method.

We note at this point that although this paper deals with 2-D 
constraints, our method can be easily generalized to higher 
dimensions as well. 

II. Notation 
This section is devoted to setting up some notation. 

A. Index sets and configurations 

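Denote the set of integers by $\mathbb{Z}$. A (2-D) index set $\mathsf{U} \subseteq \mathbb{Z}^2$ is a set of integer pairs. A 2-D configuration over $\Sigma$ with an index set $\mathsf{U}$ is a function $w : \mathsf{U} \to \Sigma$. We denote such a configuration as $w = (w_{i,j})_{(i,j) \in \mathsf{U}}$, where for all $(i,j) \in \mathsf{U}$, we have that $w_{i,j} \in \Sigma$. In this paper, index sets will always be denoted by upper-case Greek letters or upper-case Roman letters in the sans-serif font. Since many of our configurations will be $M \times N$, we have set aside special notation for their index sets; let

$$\mathsf{B}_{M,N} = \{(i,j) : 0 \le i < M , \; 0 \le j < N\} .$$

Also, denote

$$\mathsf{B}_M = \mathsf{B}_{M,M} = \{(i,j) : 0 \le i,j < M\} .$$

For integers $\alpha, \beta$ we denote the shifting of $\mathsf{U}$ by $(\alpha, \beta)$ as

$$\sigma_{\alpha,\beta}(\mathsf{U}) = \{(i + \alpha, j + \beta) : (i,j) \in \mathsf{U}\} .$$

Moreover, by abuse of notation, let $\sigma_{\alpha,\beta}(w)$ be the shifted configuration (with index set $\sigma_{\alpha,\beta}(\mathsf{U})$):

$$\sigma_{\alpha,\beta}(w)_{i+\alpha,j+\beta} = w_{i,j} .$$

For a configuration $w$ with index set $\mathsf{U}$, and an index set $\mathsf{V} \subseteq \mathsf{U}$, denote the restriction of $w$ to $\mathsf{V}$ by $w[\mathsf{V}] = (w[\mathsf{V}]_{i,j})_{(i,j) \in \mathsf{V}}$; namely,

$$w[\mathsf{V}]_{i,j} = w_{i,j} , \quad \text{where } (i,j) \in \mathsf{V} .$$

We denote the restriction of $\mathbb{S}$ to $\mathsf{U}$ by $\mathbb{S}[\mathsf{U}]$:

$$\mathbb{S}[\mathsf{U}] = \{w : \text{there exists } w' \in \mathbb{S} \text{ such that } w'[\mathsf{U}] = w\} . \tag{3}$$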
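The notation of this subsection maps directly onto sets and dictionaries. The following is a minimal sketch, not part of the paper; the helper names `box`, `shift_set`, `shift_config`, and `restrict` are hypothetical, and a configuration is assumed to be stored as a dictionary from index pairs to symbols.

```python
def box(m, n):
    """Index set of an m x n configuration: all (i, j) with 0 <= i < m, 0 <= j < n."""
    return {(i, j) for i in range(m) for j in range(n)}


def shift_set(index_set, alpha, beta):
    """Shift every index of an index set by (alpha, beta)."""
    return {(i + alpha, j + beta) for (i, j) in index_set}


def shift_config(w, alpha, beta):
    """Shift a configuration: the symbol at (i, j) moves to (i + alpha, j + beta)."""
    return {(i + alpha, j + beta): sym for (i, j), sym in w.items()}


def restrict(w, sub):
    """Restriction of configuration w to the index set sub (a subset of w's index set)."""
    return {idx: w[idx] for idx in sub}


if __name__ == "__main__":
    w = {(i, j): (i + j) % 2 for (i, j) in box(2, 3)}   # a 2 x 3 checkerboard
    print(restrict(w, box(1, 2)))                        # first two entries of row 0
    print(shift_config(restrict(w, {(1, 2)}), -1, -2))   # single entry moved to (0, 0)
```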

B. Strict total order 

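A strict total order $\prec$ is a relation on $\mathbb{Z}^2 \times \mathbb{Z}^2$, satisfying the following conditions for all $(i_1,j_1), (i_2,j_2), (i_3,j_3) \in \mathbb{Z}^2$.

• If $(i_1,j_1) \ne (i_2,j_2)$, then either $(i_1,j_1) \prec (i_2,j_2)$ or $(i_2,j_2) \prec (i_1,j_1)$, but not both.

• If $(i_1,j_1) = (i_2,j_2)$, then neither $(i_1,j_1) \prec (i_2,j_2)$ nor $(i_2,j_2) \prec (i_1,j_1)$.

• If $(i_1,j_1) \prec (i_2,j_2)$ and $(i_2,j_2) \prec (i_3,j_3)$, then $(i_1,j_1) \prec (i_3,j_3)$.

For $(i,j) \in \mathbb{Z}^2$, define $T^{(\prec)}_{i,j}$ as all the indexes preceding $(i,j)$. Namely,

$$T^{(\prec)}_{i,j} = \{(i',j') \in \mathbb{Z}^2 : (i',j') \prec (i,j)\} .$$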

C. Entropy 

Let $X$ and $Y$ be two random variables. Denote

$$p_x = \mathrm{Prob}(X = x)$$

and

$$p_{y|x} = \mathrm{Prob}(X = x, Y = y) / \mathrm{Prob}(X = x) .$$

The entropy of $X$ is denoted by $H(X)$ and is equal to

$$H(X) = -\sum_x p_x \log_2 p_x ,$$

where the sum is on all $x$ for which $\mathrm{Prob}(X = x)$ is positive. Similarly, we define the conditional entropy $H(Y|X)$ as

$$H(Y|X) = -\sum_x p_x \sum_y p_{y|x} \log_2 p_{y|x} ,$$

where we sum on all $x$ for which $p_x$ is positive and all $y$ for which $p_{y|x}$ is positive.
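As a quick numerical companion to these definitions, the sketch below computes both quantities from explicit probability tables. It is an illustration only; the function names and the dictionary representation of the distributions are assumptions made here, and logarithms are taken to base 2.

```python
import math
from collections import defaultdict


def entropy(pmf):
    """H(X) for a pmf given as {x: probability}; zero-probability terms are skipped."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def conditional_entropy(joint):
    """H(Y|X) for a joint pmf given as {(x, y): probability}."""
    p_x = defaultdict(float)
    for (x, _), p in joint.items():
        p_x[x] += p                      # marginal distribution of X
    # p / p_x[x] is the conditional probability of y given x.
    return -sum(p * math.log2(p / p_x[x])
                for (x, _), p in joint.items() if p > 0)


if __name__ == "__main__":
    fair_pair = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
    print(entropy({0: 0.5, 1: 0.5}))         # one fair bit
    print(conditional_entropy(fair_pair))    # Y independent of X
```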

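III. A preliminary upper bound on cap($\mathbb{S}$)

Let $M$ be a positive integer and let $W$ be a random variable taking values on $\mathbb{S}_M$. We say that $W$ is stationary if for all $\mathsf{U} \subseteq \mathsf{B}_M$, all $\alpha, \beta \in \mathbb{Z}$ such that $\sigma_{\alpha,\beta}(\mathsf{U}) \subseteq \mathsf{B}_M$, and all $w' \in \mathbb{S}[\mathsf{U}]$, we have that

$$\mathrm{Prob}(W[\mathsf{U}] = w') = \mathrm{Prob}(W[\sigma_{\alpha,\beta}(\mathsf{U})] = \sigma_{\alpha,\beta}(w')) .$$

The following is a corollary of [8, Theorem 1.4]. The proof is given in the Appendix.

Theorem 1: There exists a sequence of random variables $(W^{(M)})_{M=1}^{\infty}$ with the following properties: (i) Each $W^{(M)}$ takes values on $\mathbb{S}_M$. (ii) The probability distribution of $W^{(M)}$ is stationary. (iii) The normalized entropy of $W^{(M)}$ approaches cap($\mathbb{S}$),

$$\mathrm{cap}(\mathbb{S}) = \lim_{M \to \infty} \frac{1}{M^2} \cdot H(W^{(M)}) . \tag{4}$$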

We now proceed towards deriving Lemma 2 below, which gives an upper bound on cap($\mathbb{S}$), and makes use of the stationarity property. We note in advance that this bound is not actually meant to be calculated; its utility will be made clear in the following sections. In order to enhance the exposition, we accompany the derivation with two running examples.

Running Example I: Define the lexicographic order $\prec_{\mathrm{lex}}$ as follows: $(i_1,j_1) \prec_{\mathrm{lex}} (i_2,j_2)$ iff

• $i_1 < i_2$, or

• ($i_1 = i_2$ and $j_1 < j_2$).

Running Example II: Define the "interleaved raster scan" order $\prec_{\mathrm{irs}}$ as follows: $(i_1,j_1) \prec_{\mathrm{irs}} (i_2,j_2)$ iff

• $i_1 \equiv 0 \pmod 2$ and $i_2 \equiv 1 \pmod 2$, or

• $i_1 \equiv i_2 \pmod 2$ and $i_1 < i_2$, or

• $i_1 = i_2$ and $j_1 < j_2$.

(See Figure 1 for both examples.)
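Both running orders can be expressed as sort keys, which makes it easy to list the entries of a small grid in scan order. This is a sketch for illustration; the names `lex_key`, `irs_key`, and `precedes` are hypothetical and do not appear in the paper.

```python
def lex_key(idx):
    """Sort key realizing the lexicographic order: row first, then column."""
    i, j = idx
    return (i, j)


def irs_key(idx):
    """Sort key realizing the interleaved raster scan: even rows, then odd rows."""
    i, j = idx
    return (i % 2, i, j)   # Python's % is non-negative, so negative rows work too


def precedes(key, a, b):
    """True iff index a strictly precedes index b under the order induced by key."""
    return key(a) < key(b)


if __name__ == "__main__":
    grid = [(i, j) for i in range(3) for j in range(2)]
    print(sorted(grid, key=lex_key))
    print(sorted(grid, key=irs_key))
```

Because each key is an injective map into tuples compared lexicographically, the induced relation automatically satisfies the three conditions of a strict total order.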

For the rest of this section, fix positive integers r and s, and 
define the index set 

$$\Lambda = \mathsf{B}_{r,s} .$$

Fig. 1. An entry labeled $i$ in the left (right) configuration precedes an entry labeled $j$ according to $\prec_{\mathrm{lex}}$ ($\prec_{\mathrm{irs}}$) iff $i < j$.



We will refer to $\Lambda$ as "the patch." The bound we derive in Lemma 2 will be a function of the following:

• the strict total order $\prec$,

• the integers $r$ and $s$, which determine the order $r \times s$ of the patch $\Lambda$,

• an integer $c$, which will denote the number of "colors" we encounter,

• a coloring function $f : \mathbb{Z}^2 \to \{1, 2, \ldots, c\}$, mapping each point in $\mathbb{Z}^2$ to one of $c$ colors,

• $c$ indexes, $(a_\gamma, b_\gamma)_{\gamma=1}^{c}$, such that for all $1 \le \gamma \le c$, $(a_\gamma, b_\gamma) \in \Lambda$ (namely, each color $\gamma$ has a designated point in the patch, which may or may not be of color $\gamma$).

The function $f$ must satisfy two requirements, which we now elaborate on. Our first requirement is: for all $1 \le \gamma \le c$,

$$\lim_{M \to \infty} \frac{|\{(i,j) \in \mathsf{B}_M : f(i,j) = \gamma\}|}{M^2} = \frac{1}{c} . \tag{5}$$

Namely, as the orders of $\mathsf{B}_M$ tend to infinity, each color is equally^1 likely. Our second requirement is as follows: there exist index sets $\Psi_1, \Psi_2, \ldots, \Psi_c \subseteq \Lambda$ such that for all indexes $(i,j) \in \mathbb{Z}^2$,

$$\sigma_{-i',-j'}(\Psi_\gamma) = T^{(\prec)}_{i,j} \cap \sigma_{-i',-j'}(\Lambda) , \tag{6}$$

where $\gamma = f(i,j)$, $i' = a_\gamma - i$, and $j' = b_\gamma - j$. Namely, let $\gamma$ be such that $f(i,j) = \gamma$, and shift $\Lambda$ such that $(a_\gamma, b_\gamma)$ is shifted to $(i,j)$. Now, consider the set of all indexes in the shifted $\Lambda$ which precede $(i,j)$; this set must be equal to the correspondingly shifted $\Psi_\gamma$.

Running Example I: Take $r = 4$ and $s = 7$ as the patch orders. Let the number of colors be $c = 1$. Thus, we must define $f = f_{\mathrm{lex}}$ as follows: for all $(i,j) \in \mathbb{Z}^2$, $f_{\mathrm{lex}}(i,j) = 1$. Take the point corresponding to the single color as $(a_1 = 3, b_1 = 5)$. See also Figure 2(a).

Running Example II: As in the previous example, take $r = 4$ and $s = 7$ as the patch orders. Let the number of colors be $c = 2$. Define $f = f_{\mathrm{irs}}$ as follows:

$$f_{\mathrm{irs}}(i,j) = \begin{cases} 1 & i \equiv 0 \pmod 2 \\ 2 & i \equiv 1 \pmod 2 . \end{cases}$$

Take $(a_1 = 3, b_1 = 5)$ and $(a_2 = 2, b_2 = 4)$. See also Figure 2(b).

^1 In fact, it is possible to generalize (5), and require only that the limit exists for all $\gamma$. We have not found this generalization useful.

Fig. 2. The left (right) column corresponds to Running Example I (II). The configurations are of order $r \times s$ and represent the index set $\Lambda$. The • symbol is in position $(a_\gamma, b_\gamma)$. The shaded part is $\Psi_\gamma$.
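The second requirement on the coloring function can be checked mechanically for the two running examples. The sketch below is illustrative only and assumes a 4 x 7 patch with designated points (3, 5) and (2, 4); it computes, for a given index, the part of the shifted patch that precedes that index, expressed in patch coordinates. The function names are hypothetical.

```python
def lex_precedes(p, q):
    """Lexicographic order on index pairs."""
    return p < q


def irs_precedes(p, q):
    """Interleaved raster scan: even rows first, then odd rows, each left to right."""
    return (p[0] % 2, p[0], p[1]) < (q[0] % 2, q[0], q[1])


def preceding_part(precedes, r, s, anchor, point):
    """Indexes (u, v) of the r x s patch that precede `point` once the patch is
    shifted so that `anchor` lands on `point`."""
    a, b = anchor
    i, j = point
    return {(u, v) for u in range(r) for v in range(s)
            if precedes((u - a + i, v - b + j), (i, j))}


if __name__ == "__main__":
    # Color 1 (even rows) of the interleaved raster scan, designated point (3, 5).
    psi_1 = preceding_part(irs_precedes, 4, 7, (3, 5), (0, 0))
    # The same set must come out for any other even-row index.
    assert psi_1 == preceding_part(irs_precedes, 4, 7, (3, 5), (10, -4))
    print(sorted(psi_1))
```

Under these assumptions the set for the lexicographic order consists of the three full rows above the designated point plus the five entries to its left, which matches the shape of the shaded region described for Figure 2(a).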

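Lemma 2: Let $(W^{(M)})_{M=1}^{\infty}$ be as in Theorem 1 and define

$$X^{(M)} = W^{(M)}[\Lambda] .$$

Let $\prec$, $r$, $s$, $c$, $f$, $(\Psi_\gamma)_{\gamma=1}^{c}$, and $(a_\gamma, b_\gamma)_{\gamma=1}^{c}$ be given. For $1 \le \gamma \le c$, define

$$\Upsilon_\gamma = \{(a_\gamma, b_\gamma)\} \cup \Psi_\gamma .$$

Let

$$Y_\gamma = X^{(M)}[\Upsilon_\gamma] \quad \text{and} \quad Z_\gamma = X^{(M)}[\Psi_\gamma]$$

(note that $Y_\gamma$ and $Z_\gamma$ are functions of $M$). Then,

$$\mathrm{cap}(\mathbb{S}) \le \limsup_{M \to \infty} \frac{1}{c} \sum_{\gamma=1}^{c} H(Y_\gamma | Z_\gamma) .$$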
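Proof: Let $X$, $W$ and $T_{i,j}$ be shorthand for $X^{(M)}$, $W^{(M)}$ and $T^{(\prec)}_{i,j}$, respectively. First note that

$$Y_\gamma = W[\Upsilon_\gamma] \quad \text{and} \quad Z_\gamma = W[\Psi_\gamma] .$$

We show that

$$\lim_{M \to \infty} \frac{1}{M^2} H(W) \le \limsup_{M \to \infty} \frac{1}{c} \sum_{\gamma=1}^{c} H(Y_\gamma | Z_\gamma) .$$

Once this is proved, the claim follows from (4). By the chain rule [9, Theorem 2.5.1], we have

$$H(W) = \sum_{(i,j) \in \mathsf{B}_M} H(W_{i,j} \,|\, W[T_{i,j} \cap \mathsf{B}_M]) .$$

We now recall (6) and define the index set $\partial$ to be the largest subset of $\mathsf{B}_M$ for which the following condition holds: for all $(i,j) \in \partial$, we have that

$$\sigma_{-i',-j'}(\Psi_\gamma) \subseteq \mathsf{B}_M , \tag{7}$$

where hereafter in the proof, $\gamma = f(i,j)$, $i' = a_\gamma - i$, and $j' = b_\gamma - j$. Define $\bar{\partial} = \mathsf{B}_M \setminus \partial$. Note that since $r$ and $s$ are constant, and $\Psi_1, \Psi_2, \ldots, \Psi_c \subseteq \Lambda$, then

$$\frac{|\bar{\partial}|}{M^2} = O(1/M) .$$

Thus, on the one hand, we have

$$\frac{1}{M^2} \sum_{(i,j) \in \bar{\partial}} H(W_{i,j} \,|\, W[T_{i,j} \cap \mathsf{B}_M]) \le \log_2 |\Sigma| \cdot O(1/M) .$$

On the other hand, from (6) and (7) we have that for all $(i,j) \in \partial$,

$$\sigma_{-i',-j'}(\Psi_\gamma) \subseteq T_{i,j} \cap \mathsf{B}_M .$$

Hence, since conditioning reduces entropy [9, Theorem 2.6.5],

$$\frac{1}{M^2} \sum_{(i,j) \in \partial} H(W_{i,j} \,|\, W[T_{i,j} \cap \mathsf{B}_M]) \le \frac{1}{M^2} \sum_{(i,j) \in \partial} H(W[\{(i,j)\} \cup \sigma_{-i',-j'}(\Psi_\gamma)] \,|\, W[\sigma_{-i',-j'}(\Psi_\gamma)]) = \frac{1}{M^2} \sum_{(i,j) \in \partial} H(Y_\gamma | Z_\gamma) ,$$

where the last step follows from the stationarity of $W$. Recalling (5), the proof follows. ■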
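The following is a simple corollary of Lemma 2.

Corollary 3: Let $(W^{(M)})_{M=1}^{\infty}$ be as in Theorem 1 and define

$$X^{(M)} = W^{(M)}[\Lambda] .$$

Fix positive integers $r$ and $s$. Let $\ell$ be a positive integer, and let $\rho^{(1)}, \rho^{(2)}, \ldots, \rho^{(\ell)}$ be non-negative reals such that $\sum_{k=1}^{\ell} \rho^{(k)} = 1$. For every $1 \le k \le \ell$, let $\prec^{(k)}$, $c^{(k)}$, $f^{(k)}$, $(\Psi^{(k)}_\gamma)_{\gamma=1}^{c^{(k)}}$, and $(a^{(k)}_\gamma, b^{(k)}_\gamma)_{\gamma=1}^{c^{(k)}}$ be given. Also, for $1 \le \gamma \le c^{(k)}$ let

$$\Upsilon^{(k)}_\gamma = \{(a^{(k)}_\gamma, b^{(k)}_\gamma)\} \cup \Psi^{(k)}_\gamma .$$

Define

$$Y^{(k)}_\gamma = X^{(M)}[\Upsilon^{(k)}_\gamma] \quad \text{and} \quad Z^{(k)}_\gamma = X^{(M)}[\Psi^{(k)}_\gamma]$$

(note that $Y^{(k)}_\gamma$ and $Z^{(k)}_\gamma$ are functions of $M$). Then,

$$\mathrm{cap}(\mathbb{S}) \le \limsup_{M \to \infty} \sum_{k=1}^{\ell} \frac{\rho^{(k)}}{c^{(k)}} \sum_{\gamma=1}^{c^{(k)}} H(Y^{(k)}_\gamma | Z^{(k)}_\gamma) .$$

Corollary 3 is the most general way we have found to state our results. This generality will indeed help us later on. However, almost none of the intuition is lost if the reader has in mind the much simpler case of a single lexicographic order and a single color, with

$$(a^{(1)}_1, b^{(1)}_1) = (r-1, t) , \quad \text{and} \quad \Psi^{(1)}_1 = \Lambda \cap T_{a^{(1)}_1, b^{(1)}_1} , \tag{8}$$

where $0 \le t < s$. This simpler case was demonstrated in Running Example I.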



IV. Linear requirements 

Recall that $X^{(M)} = W^{(M)}[\Lambda]$ is an $r \times s$ sub-configuration of $W^{(M)}$, and thus stationary as well. In this section, we formulate a set of linear requirements (equalities and inequalities) on the probability distribution of $X^{(M)}$. For the rest of this section, let $M$ be fixed and let $X$ be shorthand for $X^{(M)}$.

A. Linear requirements from stationarity

In this subsection, we formulate a set of linear requirements that follow from the stationarity of $X$. Let $x \in \mathbb{S}[\Lambda]$ be a realization of $X$. Denote

$$p_x = \mathrm{Prob}(X = x) .$$

We start with the trivial requirements. Obviously, we must have for all $x \in \mathbb{S}[\Lambda]$ that

$$p_x \ge 0 .$$

Also,

$$\sum_{x \in \mathbb{S}[\Lambda]} p_x = 1 .$$

Next, we show how we can use stationarity to get more linear equations on $(p_x)_{x \in \mathbb{S}[\Lambda]}$. Let

$$\Lambda' = \{(i,j) : 0 \le i < r-1 , \; 0 \le j < s\} .$$

For $x' \in \mathbb{S}[\Lambda']$ we must have by stationarity that

$$\mathrm{Prob}(X[\Lambda'] = x') = \mathrm{Prob}(X[\sigma_{1,0}(\Lambda')] = \sigma_{1,0}(x')) . \tag{9}$$

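As a concrete example, suppose that $r = s = 3$, and let $x'_0$ and $x'_1$ be the two rows of some $x' \in \mathbb{S}[\Lambda']$. Then (9) states that

$$\mathrm{Prob}\left( X = \begin{pmatrix} x'_0 \\ x'_1 \\ * \; * \; * \end{pmatrix} \right) = \mathrm{Prob}\left( X = \begin{pmatrix} * \; * \; * \\ x'_0 \\ x'_1 \end{pmatrix} \right) ,$$

where rows are listed in order of increasing $i$ and $*$ denotes "don't care".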

Both the left-hand and right-hand sides of (9) are marginalizations of $(p_x)_x$. Thus, we get a set of linear equations on $(p_x)_x$, namely, for all $x' \in \mathbb{S}[\Lambda']$,

$$\sum_{x \,:\, x[\Lambda'] = x'} p_x = \sum_{x \,:\, x[\sigma_{1,0}(\Lambda')] = \sigma_{1,0}(x')} p_x .$$

To get more equations, we now apply the same rationale horizontally, instead of vertically. Let

$$\Lambda'' = \{(i,j) : 0 \le i < r , \; 0 \le j < s-1\} .$$

Then, for all $x'' \in \mathbb{S}[\Lambda'']$,

$$\sum_{x \,:\, x[\Lambda''] = x''} p_x = \sum_{x \,:\, x[\sigma_{0,1}(\Lambda'')] = \sigma_{0,1}(x'')} p_x .$$
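The stationarity equations above are easy to test on an explicit distribution. The following sketch is an illustration under stated assumptions: a patch configuration is stored as a tuple of r rows, each a tuple of s symbols, and the function names are hypothetical. It compares the marginal of the first r-1 rows with that of the last r-1 rows, and likewise for columns.

```python
from collections import defaultdict
from itertools import product


def vertical_residuals(pmf, r):
    """Marginal of rows 0..r-2 minus marginal of rows 1..r-1, per (r-1)-row block."""
    top, bottom = defaultdict(float), defaultdict(float)
    for x, p in pmf.items():
        top[x[:r - 1]] += p
        bottom[x[1:]] += p
    return {blk: top[blk] - bottom[blk] for blk in set(top) | set(bottom)}


def horizontal_residuals(pmf, s):
    """Marginal of columns 0..s-2 minus marginal of columns 1..s-1, per block."""
    left, right = defaultdict(float), defaultdict(float)
    for x, p in pmf.items():
        left[tuple(row[:s - 1] for row in x)] += p
        right[tuple(row[1:] for row in x)] += p
    return {blk: left[blk] - right[blk] for blk in set(left) | set(right)}


def satisfies_stationarity(pmf, r, s, tol=1e-12):
    """True iff all vertical and horizontal stationarity equations hold up to tol."""
    residuals = list(vertical_residuals(pmf, r).values())
    residuals += list(horizontal_residuals(pmf, s).values())
    return all(abs(v) <= tol for v in residuals)


if __name__ == "__main__":
    rows = list(product((0, 1), repeat=2))
    uniform = {(a, b): 1 / 16 for a in rows for b in rows}   # all 2 x 2 binary patches
    print(satisfies_stationarity(uniform, 2, 2))
    print(satisfies_stationarity({((0, 0), (1, 1)): 1.0}, 2, 2))
```

In an optimization setting each residual would become one linear equality constraint on the variables; here the residuals are only evaluated for a fixed distribution.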



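B. Linear equations from reflection, transposition, and complementation

We now show that if $\mathbb{S}$ is reflection, transposition, or complementation invariant (defined below), then we can derive yet more linear equations.

Define $v_M(\cdot)$ ($h_M(\cdot)$) as the vertical (horizontal) reflection of a rectangular configuration with $M$ rows (columns). Namely,

$$(v_M(w))_{i,j} = w_{M-1-i,j} , \quad \text{and} \quad (h_M(w))_{i,j} = w_{i,M-1-j} .$$

Define $\tau$ as the transposition of a configuration. Namely,

$$\tau(w)_{i,j} = w_{j,i} .$$

For $\Sigma = \{0, 1\}$, denote by comp($w$) the bitwise complement of a configuration $w$. Namely,

$$\mathrm{comp}(w)_{i,j} = \begin{cases} 1 & \text{if } w_{i,j} = 0 \\ 0 & \text{otherwise} . \end{cases}$$

We state three similar lemmas, and prove the first. The proofs of the other two are similar.

Lemma 4: Suppose that $\mathbb{S}$ is such that for all $M > 0$ and $w \in \Sigma^{M \times M}$,

$$w \in \mathbb{S} \iff h_M(w) \in \mathbb{S} \iff v_M(w) \in \mathbb{S} .$$

Then, w.l.o.g., the probability distribution of $W^{(M)}$ is such that for all $w \in \mathbb{S}_M$,

$$\mathrm{Prob}(W^{(M)} = w) = \mathrm{Prob}(W^{(M)} = h_M(w)) = \mathrm{Prob}(W^{(M)} = v_M(w)) . \tag{10}$$

Lemma 5: Suppose that $\mathbb{S}$ is such that for all $M > 0$ and $w \in \Sigma^{M \times M}$,

$$w \in \mathbb{S} \iff \tau(w) \in \mathbb{S} .$$

Then, w.l.o.g., $W^{(M)}$ is such that for all $w \in \mathbb{S}_M$,

$$\mathrm{Prob}(W^{(M)} = w) = \mathrm{Prob}(W^{(M)} = \tau(w)) . \tag{11}$$

Lemma 6: Suppose that $\Sigma = \{0, 1\}$ and $\mathbb{S}$ is such that for all $M > 0$ and $w \in \Sigma^{M \times M}$,

$$w \in \mathbb{S} \iff \mathrm{comp}(w) \in \mathbb{S} .$$

Then, w.l.o.g., $W^{(M)}$ is such that for all $w \in \mathbb{S}_M$,

$$\mathrm{Prob}(W^{(M)} = w) = \mathrm{Prob}(W^{(M)} = \mathrm{comp}(w)) . \tag{12}$$

Proof of Lemma 4: Let $W$, $h$ and $v$ be shorthand for $W^{(M)}$, $h_M$ and $v_M$, respectively. For $M$ fixed, we define a new random variable $W^{\mathrm{new}}$ taking values on $\mathbb{S}_M$ with the following distribution: for all $w \in \mathbb{S}_M$,

$$\mathrm{Prob}(W^{\mathrm{new}} = w) = \frac{1}{4} \Big[ \mathrm{Prob}(W = w) + \mathrm{Prob}(W = h(w)) + \mathrm{Prob}(W = v(w)) + \mathrm{Prob}(W = h(v(w))) \Big] .$$

Since $h(h(w)) = v(v(w)) = w$ and $h(v(w)) = v(h(w))$ we get that (10) holds for $W^{\mathrm{new}}$. Moreover, by the concavity of the entropy function,

$$H(W) \le H(W^{\mathrm{new}}) .$$

Thus, the properties defined in Theorem 1 hold for $W^{\mathrm{new}}$. ■

If the condition of Lemma 4 holds, then we get the following equations by stationarity. For all $x \in \mathbb{S}[\Lambda]$,

$$p_x = p_{v_r(x)} = p_{h_s(x)} .$$

If the condition of Lemma 5 holds, then the following holds by stationarity. Assume w.l.o.g. that $r \le s$, and let

$$\tilde{\Lambda} = \{(i,j) : 0 \le i,j < r\} .$$

For all $\chi \in \mathbb{S}[\tilde{\Lambda}]$,

$$\sum_{x \,:\, x[\tilde{\Lambda}] = \chi} p_x = \sum_{x \,:\, x[\tilde{\Lambda}] = \tau(\chi)} p_x .$$

If the condition of Lemma 6 holds, then we get the following equations by stationarity. For all $x \in \mathbb{S}[\Lambda]$,

$$p_x = p_{\mathrm{comp}(x)} .$$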
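The averaging step used for reflection invariance can be tried out numerically. The sketch below is illustrative and not from the paper: it symmetrizes a distribution over the four maps identity, horizontal reflection, vertical reflection, and their composition, and lets one confirm that the entropy does not decrease. Configurations are assumed to be tuples of rows; all names are hypothetical.

```python
import math
from collections import defaultdict


def h(w):
    """Horizontal reflection: reverse every row."""
    return tuple(row[::-1] for row in w)


def v(w):
    """Vertical reflection: reverse the order of the rows."""
    return w[::-1]


def symmetrize(pmf):
    """Average a pmf over the images w, h(w), v(w), h(v(w)) of each configuration."""
    new = defaultdict(float)
    for w, p in pmf.items():
        for image in (w, h(w), v(w), h(v(w))):
            new[image] += p / 4      # each map is an involution, so this is the average
    return dict(new)


def entropy(pmf):
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


if __name__ == "__main__":
    pmf = {((1, 0), (0, 0)): 0.7, ((0, 0), (0, 0)): 0.3}
    sym = symmetrize(pmf)
    print(entropy(pmf), entropy(sym))
```

The same pattern would apply to transposition or complementation by swapping in the corresponding set of maps.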

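V. An upper bound on cap($\mathbb{S}$)

For the rest of this section, let $r$, $s$, $\ell$, $\rho^{(k)}$, $\prec^{(k)}$, $c^{(k)}$, $f^{(k)}$, $(\Psi^{(k)}_\gamma)$, and $(a^{(k)}_\gamma, b^{(k)}_\gamma)$ be given as in Corollary 3. Recall from Corollary 3 that we are interested in $H(Y^{(k)}_\gamma | Z^{(k)}_\gamma)$, in order to bound cap($\mathbb{S}$) from above.

As a first step, we fix $M$ and express $H(Y^{(k)}_\gamma | Z^{(k)}_\gamma)$ in terms of the probabilities $(p_x)_x$ of the random variable $X^{(M)}$. For given $1 \le k \le \ell$ and $1 \le \gamma \le c^{(k)}$, let

$$y \in \mathbb{S}[\Upsilon^{(k)}_\gamma] \quad \text{and} \quad z \in \mathbb{S}[\Psi^{(k)}_\gamma]$$

be realizations of $Y^{(k)}_\gamma$ and $Z^{(k)}_\gamma$, respectively. Let

$$p^{(k)}_{\gamma,y} = \mathrm{Prob}(Y^{(k)}_\gamma = y) \quad \text{and} \quad p^{(k)}_{\gamma,z} = \mathrm{Prob}(Z^{(k)}_\gamma = z)$$

($p^{(k)}_{\gamma,y}$ and $p^{(k)}_{\gamma,z}$ are functions of $M$). From here onward, let $p_y$ and $p_z$ be shorthand for $p^{(k)}_{\gamma,y}$ and $p^{(k)}_{\gamma,z}$, respectively. Both $p_y$ and $p_z$ are marginalizations of $(p_x)_x$, namely,

$$p_y = \sum_{x \in \mathbb{S}[\Lambda] \,:\, x[\Upsilon^{(k)}_\gamma] = y} p_x , \qquad p_z = \sum_{x \in \mathbb{S}[\Lambda] \,:\, x[\Psi^{(k)}_\gamma] = z} p_x .$$

Thus, for given $\gamma$ and $k$,

$$H(Y^{(k)}_\gamma | Z^{(k)}_\gamma) = -\sum_{y \in \mathbb{S}[\Upsilon^{(k)}_\gamma]} p_y \log_2 \frac{p_y}{p_z} , \quad \text{where } z = y[\Psi^{(k)}_\gamma] ,$$

is a function of the probabilities $(p_x)_x$ of $X^{(M)}$.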
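To make the dependence on the patch probabilities concrete, the sketch below evaluates one conditional-entropy term from an explicit distribution on patch configurations. It is an illustration under assumptions made here: a configuration is a tuple of rows, the two index sets are passed as Python sets with the second contained in the first, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def restrict(x, index_set):
    """Restriction of patch x to an index set, as a sorted tuple of (index, symbol)."""
    return tuple(sorted((idx, x[idx[0]][idx[1]]) for idx in index_set))


def conditional_entropy_term(pmf, upsilon, psi):
    """-sum_y p_y * log2(p_y / p_z), where z is y restricted to psi (psi within upsilon)."""
    p_y, p_z = defaultdict(float), defaultdict(float)
    for x, p in pmf.items():
        p_y[restrict(x, upsilon)] += p   # marginal on the larger index set
        p_z[restrict(x, psi)] += p       # marginal on the conditioning index set
    total = 0.0
    for y, py in p_y.items():
        if py > 0:
            z = tuple(item for item in y if item[0] in psi)
            total -= py * math.log2(py / p_z[z])
    return total


if __name__ == "__main__":
    # 1 x 2 patches with no two adjacent 1s, uniformly distributed (toy example).
    pmf = {((0, 0),): 1 / 3, ((0, 1),): 1 / 3, ((1, 0),): 1 / 3}
    print(conditional_entropy_term(pmf, {(0, 0), (0, 1)}, {(0, 0)}))
```

In the toy example the first symbol is 0 with probability 2/3, in which case the second symbol is a fair bit, and 1 with probability 1/3, in which case the second symbol is forced; the term therefore evaluates to 2/3.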

Our next step will be to reason as follows: We have found linear requirements that the $p_x$'s satisfy and expressed $H(Y^{(k)}_\gamma | Z^{(k)}_\gamma)$ as a function of $(p_x)_x$. However, we do not know of a way to actually calculate $(p_x)_x$. So, instead of the probabilities $(p_x)_x$, consider the variables $(\bar{p}_x)_x$. From this line of thought we get our main theorem.

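Theorem 7: The value of the optimization program given in Figure 3 is an upper bound on cap($\mathbb{S}$).

Proof: First, notice that if we take $\bar{p}_x = p_x$, then (by Section IV) all the requirements which the $\bar{p}_x$'s are subject to indeed hold, and the objective function is equal to

$$\sum_{k=1}^{\ell} \frac{\rho^{(k)}}{c^{(k)}} \sum_{\gamma=1}^{c^{(k)}} H(Y^{(k)}_\gamma | Z^{(k)}_\gamma) .$$

So, the maximum is an upper bound on the above expression. Next, by compactness, a maximum indeed exists. Since the maximum is not a function of $M$, the claim now follows from Corollary 3. ■

maximize

$$\sum_{k=1}^{\ell} \frac{\rho^{(k)}}{c^{(k)}} \sum_{\gamma=1}^{c^{(k)}} \left( -\sum_{y \in \mathbb{S}[\Upsilon^{(k)}_\gamma]} \bar{p}^{(k)}_{\gamma,y} \log_2 \bar{p}^{(k)}_{\gamma,y} + \sum_{z \in \mathbb{S}[\Psi^{(k)}_\gamma]} \bar{p}^{(k)}_{\gamma,z} \log_2 \bar{p}^{(k)}_{\gamma,z} \right)$$

over the variables $(\bar{p}_x)_{x \in \mathbb{S}[\Lambda]}$, where for $1 \le k \le \ell$, $1 \le \gamma \le c^{(k)}$, $y \in \mathbb{S}[\Upsilon^{(k)}_\gamma]$, $z \in \mathbb{S}[\Psi^{(k)}_\gamma]$, we define

$$\bar{p}^{(k)}_{\gamma,y} = \sum_{x \in \mathbb{S}[\Lambda] \,:\, x[\Upsilon^{(k)}_\gamma] = y} \bar{p}_x , \qquad \bar{p}^{(k)}_{\gamma,z} = \sum_{x \in \mathbb{S}[\Lambda] \,:\, x[\Psi^{(k)}_\gamma] = z} \bar{p}_x ,$$

and the variables $\bar{p}_x$ are subject to the following requirements:

(i) $\sum_{x \in \mathbb{S}[\Lambda]} \bar{p}_x = 1$.

(ii) For all $x \in \mathbb{S}[\Lambda]$, $\bar{p}_x \ge 0$.

(iii) For all $x' \in \mathbb{S}[\Lambda']$, $\sum_{x \,:\, x[\Lambda'] = x'} \bar{p}_x = \sum_{x \,:\, x[\sigma_{1,0}(\Lambda')] = \sigma_{1,0}(x')} \bar{p}_x$.

(iv) For all $x'' \in \mathbb{S}[\Lambda'']$, $\sum_{x \,:\, x[\Lambda''] = x''} \bar{p}_x = \sum_{x \,:\, x[\sigma_{0,1}(\Lambda'')] = \sigma_{0,1}(x'')} \bar{p}_x$.

(v) (If $\mathbb{S}$ is reflection (resp. complementation) invariant) For all $x \in \mathbb{S}[\Lambda]$, $\bar{p}_x = \bar{p}_{v_r(x)} = \bar{p}_{h_s(x)}$ (resp. $\bar{p}_x = \bar{p}_{\mathrm{comp}(x)}$).

(vi) (If $\mathbb{S}$ is transposition invariant) For all $\chi \in \mathbb{S}[\tilde{\Lambda}]$, $\sum_{x \,:\, x[\tilde{\Lambda}] = \chi} \bar{p}_x = \sum_{x \,:\, x[\tilde{\Lambda}] = \tau(\chi)} \bar{p}_x$.

Fig. 3. Optimization program over the variables $\bar{p}_x$ (assuming w.l.o.g. that $r \le s$). The optimum is an upper bound on cap($\mathbb{S}$).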
We now proceed to show that the optimization problem in Figure 3 is an instance of concave programming [10, p. 137], and thus easily calculated. Since the requirements that the variables $(\bar{p}_x)_x$ are subject to are linear, this reduces to showing that the objective function is concave in $(\bar{p}_x)_x$.

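Lemma 8: The objective function in Figure 3 is concave in the variables $(\bar{p}_x)_{x \in \mathbb{S}[\Lambda]}$, subject to them being non-negative.

Proof: Recall that for all $1 \le k \le \ell$ we have that $\rho^{(k)}$ is non-negative. Thus, it suffices to prove that for all $1 \le k \le \ell$ and $1 \le \gamma \le c^{(k)}$, the function

$$\Xi(k,\gamma) = -\sum_{y \in \mathbb{S}[\Upsilon^{(k)}_\gamma]} \bar{p}^{(k)}_{\gamma,y} \log_2 \bar{p}^{(k)}_{\gamma,y} + \sum_{z \in \mathbb{S}[\Psi^{(k)}_\gamma]} \bar{p}^{(k)}_{\gamma,z} \log_2 \bar{p}^{(k)}_{\gamma,z}$$

is concave in the variables $(\bar{p}_x)_x$. So, let $k$ and $\gamma$ be fixed, and let $\bar{p}_y$ and $\bar{p}_z$ be shorthand for $\bar{p}^{(k)}_{\gamma,y}$ and $\bar{p}^{(k)}_{\gamma,z}$, respectively. Recalling the definition of $\bar{p}^{(k)}_{\gamma,y}$ and $\bar{p}^{(k)}_{\gamma,z}$ in Figure 3 and the fact that $\Psi^{(k)}_\gamma \subseteq \Upsilon^{(k)}_\gamma$ we get that

$$\Xi(k,\gamma) = \sum_{y \in \mathbb{S}[\Upsilon^{(k)}_\gamma]} -\bar{p}_y \log_2 \frac{\bar{p}_y}{\bar{p}_z} , \quad \text{where } z = y[\Psi^{(k)}_\gamma] .$$

Thus, it suffices to show that each summand is concave in $(\bar{p}_x)_x$. This is indeed the case: let $(\bar{p}^{(1)}_x)_{x \in \mathbb{S}[\Lambda]}$ and $(\bar{p}^{(2)}_x)_{x \in \mathbb{S}[\Lambda]}$ be non-negative. Let $0 \le \xi \le 1$ be given, and define $(\bar{p}^{(3)}_x)_{x \in \mathbb{S}[\Lambda]}$ as

$$\bar{p}^{(3)}_x = \xi \bar{p}^{(1)}_x + (1 - \xi) \bar{p}^{(2)}_x , \quad x \in \mathbb{S}[\Lambda] .$$

For $t = 1, 2, 3$, denote by $\bar{p}^{(t)}_y$ and $\bar{p}^{(t)}_z$ the marginalizations corresponding to $(\bar{p}^{(t)}_x)_x$. Obviously,

$$\bar{p}^{(3)}_y = \xi \bar{p}^{(1)}_y + (1 - \xi) \bar{p}^{(2)}_y$$

and

$$\bar{p}^{(3)}_z = \xi \bar{p}^{(1)}_z + (1 - \xi) \bar{p}^{(2)}_z .$$

We must show that for all $y \in \mathbb{S}[\Upsilon^{(k)}_\gamma]$, $z = y[\Psi^{(k)}_\gamma]$,

$$\bar{p}^{(3)}_y \log_2 \frac{\bar{p}^{(3)}_y}{\bar{p}^{(3)}_z} \le \xi \bar{p}^{(1)}_y \log_2 \frac{\bar{p}^{(1)}_y}{\bar{p}^{(1)}_z} + (1 - \xi) \bar{p}^{(2)}_y \log_2 \frac{\bar{p}^{(2)}_y}{\bar{p}^{(2)}_z} .$$

This is indeed the case, by the log sum inequality [9, p. 29]. ■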
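The concavity of each summand can be spot-checked numerically. The sketch below is only a sanity check, not a proof and not part of the paper: it draws random pairs of marginals with the smaller marginal bounded by the larger one, forms a convex combination, and tests the concavity inequality up to a small floating-point tolerance. All names and the sampling ranges are assumptions made here.

```python
import math
import random


def summand(p_y, p_z):
    """One term -p_y * log2(p_y / p_z) of the objective, with 0 * log 0 taken as 0."""
    return -p_y * math.log2(p_y / p_z) if p_y > 0 else 0.0


def concavity_holds(trials=1000, seed=0):
    """Check f(xi*a + (1-xi)*b) >= xi*f(a) + (1-xi)*f(b) on random points."""
    rng = random.Random(seed)
    for _ in range(trials):
        pz1, pz2 = rng.uniform(0.01, 1.0), rng.uniform(0.01, 1.0)
        py1 = rng.uniform(0.001, 1.0) * pz1   # p_y never exceeds its marginal p_z
        py2 = rng.uniform(0.001, 1.0) * pz2
        xi = rng.random()
        py3 = xi * py1 + (1 - xi) * py2
        pz3 = xi * pz1 + (1 - xi) * pz2
        mixed = xi * summand(py1, pz1) + (1 - xi) * summand(py2, pz2)
        if summand(py3, pz3) < mixed - 1e-12:
            return False
    return True


if __name__ == "__main__":
    print(concavity_holds())
```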



VI. Computational results 

At this point, we have formulated a concave optimization problem, and wish to solve it. There are quite a few programs, termed solvers, that enable one to do so. Many such solvers (most of them proprietary) are hosted on the servers of the NEOS project [11], [12], [13], and the public may submit moderately sized optimization problems to them. We have coded our optimization problems in the AMPL modeling language [14], and submitted them to NEOS.

Essentially, a solver starts with some initial guess as to the optimizing value of $(\bar{p}_x)_{x \in \mathbb{S}[\Lambda]}$, and then iteratively improves the value of the objective function. This process is terminated when the solver decides that it is "close enough" to the optimum. Denote by $\hat{p} = (\hat{p}_x)_{x \in \mathbb{S}[\Lambda]}$ this "close enough" assignment to the variables. Of course, we must supply an upper bound on cap($\mathbb{S}$), not an approximation to one. Thus, let $\hat{f}$ and

$$\hat{g} = (\hat{g}_x)_x , \quad x \in \mathbb{S}[\Lambda] ,$$

be the value of the objective function and its gradient at $\hat{p}$, respectively. Obviously, $\hat{f}$ is a lower bound on the value of our optimization problem. For an upper bound, we replace^2 the objective function in Figure 3 by

$$\text{maximize} \quad \hat{f} + \sum_{x \in \mathbb{S}[\Lambda]} \hat{g}_x \cdot (\bar{p}_x - \hat{p}_x)$$

and get a linear program (the value of which can be calculated exactly). By concavity, the value of this linear program is indeed an upper bound. So, we use NEOS yet again to solve it. For the sake of double-checking, we submitted the above optimization problems to two solvers: IPOPT [15] and MOSEK.

^2 We remark in passing that if we had chosen to optimize the dual problem [10, p. 215], then the "dual of" $\hat{f}$ would already have been an upper bound. However, we have not managed to state the dual problem in closed form.

Fig. 4. An entry labeled $i$ in the configuration precedes an entry labeled $j$ according to $\prec_{\mathrm{skip}}$ iff $i < j$.
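The idea of turning an approximate maximizer into a certified upper bound can be seen on a toy problem. The sketch below is not the paper's program: it assumes the feasible set is the plain probability simplex and the objective is the entropy, so the linearized problem can be solved by inspection (a linear function over the simplex attains its maximum at a vertex). The function names are hypothetical.

```python
import math


def entropy(p):
    """Concave objective: entropy in bits of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def entropy_gradient(p):
    """Gradient of the entropy at a strictly positive point p."""
    return [-math.log2(q) - 1.0 / math.log(2) for q in p]


def linearized_upper_bound(p_hat):
    """Maximum over the simplex of f(p_hat) + g . (p - p_hat).

    By concavity this linear function dominates the objective everywhere,
    so its maximum upper-bounds the true maximum.
    """
    f_hat = entropy(p_hat)
    g_hat = entropy_gradient(p_hat)
    offset = f_hat - sum(g * q for g, q in zip(g_hat, p_hat))
    return offset + max(g_hat)       # best vertex of the simplex


if __name__ == "__main__":
    p_hat = [0.34, 0.33, 0.33]       # a "close enough" point, not the exact optimum
    print(entropy(p_hat), math.log2(3), linearized_upper_bound(p_hat))
```

The value at the approximate point is a lower bound on the true maximum, the linearized maximum is an upper bound, and the gap between the two shrinks as the approximate point approaches the optimum.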

Before stating our computational results, let us first define one more strict total order, which we have termed the "skip" order, $\prec_{\mathrm{skip}}$ (see Figure 4). We have that $(i_1,j_1) \prec_{\mathrm{skip}} (i_2,j_2)$ iff

• $i_1 < i_2$, or

• ($i_1 = i_2$ and $j_1 \equiv 0 \pmod 2$ and $j_2 \equiv 1 \pmod 2$), or

• ($i_1 = i_2$ and $j_1 \equiv j_2 \pmod 2$ and $j_1 < j_2$).

Our computational results appear in Table I. To the best of our knowledge, they are presently the tightest. The penultimate column contains upper bounds obtained by the method described in [6]. When available, these compared-to bounds are taken from previously published work, as indicated to the right of them. The rest are the result of our implementation of [6]. For reference, the last column contains corresponding lower bounds. We note that the indexes $(a^{(k)}_\gamma, b^{(k)}_\gamma)$ and coefficients $\rho^{(k)}$ used for each constraint were optimized by hand, through trial and error. Also, we note that when applying our method to the 2-D $(1, \infty)$-RLL constraint, our bound was inferior to the one presented in [2] (utilizing the method of [7]).

VII. Asymptotic analysis 

For a given constraint $\mathbb{S}$ and positive integers $r$ and $s$, let $t$ be an integer such that $0 \le t < s$. Denote by $\mu(r,s,t)$ the value of the optimization program in Figure 3, where the parameters are as in (8). In this section, we show that even if we restrict ourselves to this simple case, we get an upper bound which is asymptotically tight, in the following sense.

Theorem 9: For all $\epsilon > 0$, there exist

$$r_0 > 0 , \quad s_0 > 0 , \quad 0 \le t_0 < s_0$$

such that for all

$$r \ge r_0 , \quad s \ge s_0 , \quad t_0 \le t \le s - (s_0 - t_0) ,$$

we have that

$$\mu(r,s,t) - \mathrm{cap}(\mathbb{S}) \le \epsilon .$$

In order to prove Theorem 9, we need the following lemma.

Lemma 10: For all $\epsilon > 0$, there exist

$$r_0 > 0 , \quad s_0 > 0 , \quad 0 \le t_0 < s_0$$

such that

$$\mu(r_0, s_0, t_0) - \mathrm{cap}(\mathbb{S}) \le \epsilon .$$


Proof: Another well known method for bounding cap($\mathbb{S}$) from above is the so-called "stripe method", mentioned in the introduction. Namely, for some given $\theta$, consider the 1-D constraint $S = S(\theta)$ defined as follows. The alphabet of the constraint is $\Sigma^\theta$. A word of length $r'$ satisfies $S$ if and only if when we write its entries as rows of length $\theta$, one below the other, we get an $r' \times \theta$ configuration which satisfies the 2-D constraint $\mathbb{S}$.

Define the normalized capacity of $S$ as

$$\overline{\mathrm{cap}}(S) = \frac{1}{\theta} \mathrm{cap}(S) .$$

By the definition of cap($\mathbb{S}$), the normalized capacity approaches cap($\mathbb{S}$) as $\theta \to \infty$. Thus, fix a $\theta$ such that

$$\overline{\mathrm{cap}}(S) - \mathrm{cap}(\mathbb{S}) \le \epsilon/2 .$$

We say that a 1-D constraint has memory $m$ if there exists a graph representing it, and all paths in the graph of length $m$ with the same labeling terminate in the same vertex. By [1, Theorem 3.17] and its proof, there exists a sequence of 1-D constraints $\{S_m\}_{m=1}^{\infty}$ such that $S \subseteq S_m$, the memory of $S_m$ is $m$, and $\lim_{m \to \infty} \mathrm{cap}(S_m) = \mathrm{cap}(S)$. Thus, fix $m$ such that

$$\overline{\mathrm{cap}}(S_m) - \overline{\mathrm{cap}}(S) \le \epsilon/2 .$$

To finish the proof, we now show that

$$\mu(r_0, s_0, t_0) \le \overline{\mathrm{cap}}(S_m) ,$$

where

$$r_0 = m + 1 , \quad s_0 = 2 \cdot \theta , \quad t_0 = \theta - 1 .$$

Note that $\mu(r_0, s_0, t_0)$ is the maximum of

$$H(X_{m,\theta-1} \,|\, X[T_{m,\theta-1} \cap \mathsf{B}_{m+1,2\theta}]) \tag{13}$$

over all random variables $X \in \mathbb{S}_{m+1,2\theta}$ with a probability distribution satisfying our linear requirements.

For all $0 \le \phi < \theta$ we get by the (imposed) stationarity of $X$ that (13) is bounded from above by

$$H(X_{m,\phi} \,|\, X[T_{m,\phi} \cap \mathsf{B}_{m+1,\theta}]) .$$

So, (13) is also bounded from above by

$$\frac{1}{\theta} \sum_{\phi=0}^{\theta-1} H(X_{m,\phi} \,|\, X[T_{m,\phi} \cap \mathsf{B}_{m+1,\theta}]) . \tag{14}$$

The first $\theta$ columns of $X$ form a configuration with index set $\mathsf{B}_{m+1,\theta}$. By our linear requirements, stationarity (specifically, vertical stationarity) holds for this configuration as well. So, we may define a stationary 1-D Markov chain [1, §3.2.3] on $S_m$, with normalized entropy given by (14). That entropy, in turn, is at most $\overline{\mathrm{cap}}(S_m)$. ■

TABLE I
Upper bounds on the capacity of some 2-D constraints.

Constraint      r   s   k   ≺ used           Upper bound   Comparison      Lower bound
(2, ∞)-RLL      3   8   1   ≺_lex, ≺_skip    0.4457        0.4459 [16]     0.444202 [17]
(3, ∞)-RLL      4   8   5                    0.36821       0.3686 [16]     0.365623 [18]
(0, 2)-RLL      3   5   2                    0.816731      0.817053        0.816007 [18]
n.i.b.          3   4   1                    0.92472       0.927855        0.922640 [17]
Proof of Theorem 9: The following inequalities are easily verified:

$$\mu(r,s,t) \ge \mu(r+1,s,t) ,$$
$$\mu(r,s,t) \ge \mu(r,s+1,t) ,$$
$$\mu(r,s,t) \ge \mu(r,s+1,t+1) .$$

The proof follows from them and Lemma 10. ■
Appendix 

Our goal in this appendix is to prove Theorem 1. Essentially, Theorem 1 will turn out to be a corollary of [8, Theorem 1.4]. However, [8, Theorem 1.4] deals with configurations in which the index set is $\mathbb{Z}^2$. So, some definitions and auxiliary lemmas are in order.

Recall that $(G_{\mathrm{row}}, G_{\mathrm{col}})$ is the pair of vertex-labeled graphs through which $\mathbb{S} = \mathbb{S}(G_{\mathrm{row}}, G_{\mathrm{col}})$ is defined. Also, recall that each member of $\mathbb{S}$ is a configuration with a rectangular index set. Namely, the index set of a configuration in $\mathbb{S}$ is $\sigma_{i,j}(\mathsf{B}_{M,N})$, for some $i$, $j$, $M$, and $N$. We now give a very similar definition to that of $\mathbb{S}$, only now we require that the index set of each configuration is $\mathbb{Z}^2$. Namely, define $\mathcal{S} = \mathcal{S}(G_{\mathrm{row}}, G_{\mathrm{col}})$ as follows: A configuration $(w_{i,j})_{(i,j) \in \mathbb{Z}^2}$ over $\Sigma$ is in $\mathcal{S}(G_{\mathrm{row}}, G_{\mathrm{col}})$ iff there exists a configuration $(u_{i,j})_{(i,j) \in \mathbb{Z}^2}$ over the vertex set $V$ with the following properties: for all $(i,j) \in \mathbb{Z}^2$, (a) the labeling of $u_{i,j}$ satisfies $L(u_{i,j}) = w_{i,j}$; (b) there exists an edge from $u_{i,j}$ to $u_{i,j+1}$ in $G_{\mathrm{row}}$; (c) there exists an edge from $u_{i,j}$ to $u_{i+1,j}$ in $G_{\mathrm{col}}$.

For positive integers $M, N > 0$, define $\mathcal{S}_{M,N}$ as the restriction of $\mathcal{S}$ to $\mathsf{B}_{M,N}$. Namely,

$$\mathcal{S}_{M,N} = \mathcal{S}[\mathsf{B}_{M,N}] ,$$

where the definition of the restriction operation is as in (3). Also, for $M$ equal to $N$, define

$$\mathcal{S}_M = \mathcal{S}_{M,M} .$$

Note that for all $M, N > 0$ we have

$$\mathcal{S}_{M,N} \subseteq \mathbb{S}_{M,N} , \tag{15}$$

and there are cases in which the inclusion is strict. Next, define the capacity of $\mathcal{S}$ as

$$\mathrm{cap}(\mathcal{S}) = \lim_{M \to \infty} \frac{1}{M^2} \cdot \log_2 |\mathcal{S}_M| .$$

The limit indeed exists, by sub-additivity (see [8, Appendix], and references therein).



For integers $M, N > 0$ and $\delta \ge 0$, denote

$$\mathsf{C}_{M,N,\delta} = \sigma_{-\delta,-\delta}(\mathsf{B}_{M+2\delta,N+2\delta})$$

and let

$$\mathbb{S}_{M,N,\delta} = \mathbb{S}[\mathsf{C}_{M,N,\delta}] .$$

Note that the index set $\mathsf{C}_{M,N,\delta}$ of each element of $\mathbb{S}_{M,N,\delta}$ is simply $\mathsf{B}_{M,N}$, padded with $\delta$ columns to the right and left and $\delta$ rows to the top and bottom. The following lemma will help us bridge the gap between finite and infinite index sets.

Lemma 11: Let $w$ be a configuration over the finite alphabet $\Sigma$ with index set $\mathsf{B}_{M,N}$. If for all $\delta \ge 0$ we have that

$$w \in \mathbb{S}_{M,N,\delta}[\mathsf{B}_{M,N}] , \tag{16}$$

then we must have that

$$w \in \mathcal{S}_{M,N} .$$

Proof: Define the following auxiliary directed graph. The vertex set is

$$\bigcup_{\delta \ge 0} \{\omega \in \mathbb{S}_{M,N,\delta} : \omega[\mathsf{B}_{M,N}] = w\} .$$

For every $\delta \ge 0$, there is a directed edge from $\omega_1 \in \mathbb{S}_{M,N,\delta}$ to $\omega_2 \in \mathbb{S}_{M,N,\delta+1}$ iff $\omega_1 = \omega_2[\mathsf{C}_{M,N,\delta}]$. It is easily seen that this graph is a directed tree with root $w$, as defined in [19, §2.4]. Since (16) holds for all $\delta \ge 0$, the vertex set of the tree is infinite (and countable). On the other hand, since the alphabet size $|\Sigma|$ is finite, the out-degree of each vertex is finite. Thus, by König's Infinity Lemma [19, Theorem 2.8], we must have an infinite path in the tree starting from the root $w$.

Denote the vertices of the above-mentioned infinite path as

$$w = w^{(0)}, w^{(1)}, w^{(2)}, \ldots$$

We now show how to find a configuration $w' = (w'_{i,j})_{(i,j) \in \mathbb{Z}^2}$ such that $w' \in \mathcal{S}$ and $w = w'[\mathsf{B}_{M,N}]$. For each $(i,j) \in \mathbb{Z}^2$, define $w'_{i,j}$ as follows: let $\delta \ge 0$ be such that $(i,j) \in \mathsf{C}_{M,N,\delta}$, and take $w'_{i,j} = w^{(\delta)}_{i,j}$. It is easily seen that $w'$ is well defined and contained in $\mathcal{S}$. ■
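The following lemma states that although the inclusion in (15) may be strict, the capacities of $\mathbb{S}$ and $\mathcal{S}$ are equal.

Lemma 12: Let $\mathbb{S}$ and $\mathcal{S}$ be as previously defined. Then,

$$\mathrm{cap}(\mathcal{S}) = \mathrm{cap}(\mathbb{S}) . \tag{17}$$

Proof: By (15), we must have that cap($\mathcal{S}$) $\le$ cap($\mathbb{S}$). For the other direction, it suffices to prove that for all $M > 0$,

$$\mathrm{cap}(\mathbb{S}) \le \frac{1}{M^2} \log_2 |\mathcal{S}_M| . \tag{18}$$

So, let us fix $M$ and prove the above. By Lemma 11, there exists $\delta \ge 0$ such that for all $w \in \Sigma^{M \times M}$,

$$w \notin \mathcal{S}_M \implies w \notin \mathbb{S}_{M,M,\delta}[\mathsf{B}_M] .$$

For $t > 0$, let $M'$ be shorthand for

$$M' = t \cdot M .$$

By the definition of capacity, we have that

$$\mathrm{cap}(\mathbb{S}) = \lim_{t \to \infty} \frac{1}{M'^2} \log_2 |\mathbb{S}_{M'}| . \tag{19}$$

Now, let us partition $\mathsf{B}_{M'}$ into the following disjoint sub-sets of indexes: for $0 \le i,j < t$, define the set

$$\mathsf{D}_{i,j} = \sigma_{i \cdot M, j \cdot M}(\mathsf{B}_M) .$$

Let $w' \in \mathbb{S}_{M'}$. Notice that for all $0 \le i,j < t$ for which

$$\sigma_{i \cdot M, j \cdot M}(\mathsf{C}_{M,M,\delta}) \subseteq \mathsf{B}_{M'} , \tag{20}$$

we must have that $w'[\mathsf{D}_{i,j}]$ is equal to some correspondingly shifted element of $\mathcal{S}_M$. On the other hand, for $M$ and $\delta$ fixed, the number of pairs $(i,j)$ for which (20) does not hold is $O(t)$. Thus, a simple calculation gives us that

$$\frac{1}{M'^2} \log_2 |\mathbb{S}_{M'}| \le \frac{1}{M^2} \log_2 |\mathcal{S}_M| + O(1/t) .$$

This, together with (19), proves (18). ■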
For a given $M > 0$, define the set $\mathcal{F}(M)$ of configurations with index set $\mathbb{Z}^2$ as follows: a configuration $w = (w_{i,j})_{(i,j) \in \mathbb{Z}^2}$ is in $\mathcal{F}(M)$ iff for all $(i,j) \in \mathbb{Z}^2$,

$$\sigma_{-i,-j}(w[\sigma_{i,j}(\mathsf{B}_M)]) \in \mathcal{S}_M .$$

Namely, each $M \times M$ "patch" is a correspondingly shifted element of $\mathcal{S}_M$.

Note that there exist vertex-labeled graphs $G_{\mathrm{row}}(M)$ and $G_{\mathrm{col}}(M)$ such that $\mathcal{F}(M) = \mathcal{S}(G_{\mathrm{row}}(M), G_{\mathrm{col}}(M))$. Specifically, the vertex set of both graphs is equal to $\mathcal{S}_M$; the label of each such vertex is its lower-left entry; there is an edge from $w_1 \in \mathcal{S}_M$ to $w_2 \in \mathcal{S}_M$ in $G_{\mathrm{row}}(M)$ ($G_{\mathrm{col}}(M)$) iff the last $M-1$ columns (rows) of $w_1$ are equal to the first $M-1$ columns (rows) of $w_2$. Thus, cap($\mathcal{F}(M)$) exists. Also, since $w \in \mathcal{S}$ implies $w \in \mathcal{F}(M)$, we have

$$\mathrm{cap}(\mathcal{S}) \le \mathrm{cap}(\mathcal{F}(M)) . \tag{21}$$

The following is a direct corollary of [8, Theorem 1.4].

Corollary 13: For all $M > 0$, there exists a stationary random variable $W^{(M)}$ taking values on $\mathcal{F}(M)[\mathsf{B}_M]$ such that

$$\mathrm{cap}(\mathcal{F}(M)) \le \frac{1}{M^2} H(W^{(M)}) . \tag{22}$$

Proof of Theorem 1: Notice that

$$\mathcal{F}(M)[\mathsf{B}_M] = \mathcal{S}_M \subseteq \mathbb{S}_M .$$

Thus, take $W^{(M)}$ as in Corollary 13 and notice that it satisfies conditions (i) and (ii) in Theorem 1. From (17), (21), and (22) we get that

$$\mathrm{cap}(\mathbb{S}) \le \lim_{M \to \infty} \frac{1}{M^2} \cdot H(W^{(M)}) .$$

But since $W^{(M)}$ takes values on $\mathbb{S}_M$, we have by [9, Page 19] that the above inequality is in fact an equality. Thus, condition (iii) is proved. ■